Deep CNN based feature extractor for text-prompted speaker recognition
Abstract
Deep learning is still not a very common tool in the speaker verification field. We study the performance of a deep convolutional neural network in the text-prompted speaker verification task. The prompted passphrase is segmented into word states (i.e., digits) so that each digit utterance can be tested separately. We train a single high-level feature extractor for all states and use the cosine similarity metric for scoring. The key feature of our network is the Max-Feature-Map activation function, which acts as an embedded feature selector. By using a multitask learning scheme to train the high-level feature extractor, we were able to surpass the classic baseline systems in terms of quality and achieved impressive results for such a new approach, obtaining 2.85% EER on the RSR2015 evaluation set. Fusion of the proposed and the baseline systems improves this result.
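The two building blocks named in the abstract, the Max-Feature-Map activation and cosine scoring, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the common Max-Feature-Map formulation in which the channel vector is split into two halves and the element-wise maximum is kept, and it assumes that per-digit scores are combined by a plain average, which the abstract does not specify. The function names and the digit-to-embedding dictionaries are hypothetical.

```python
import math


def max_feature_map(activations):
    """Assumed MFM form: element-wise max over the two halves of a channel vector."""
    if len(activations) % 2 != 0:
        raise ValueError("MFM needs an even number of channels")
    half = len(activations) // 2
    # Each output channel keeps the larger of two competing input channels,
    # which is why the activation behaves like a built-in feature selector.
    return [max(a, b) for a, b in zip(activations[:half], activations[half:])]


def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def score_passphrase(enroll, test):
    """Average the per-digit cosine scores (averaging is an assumption).

    Both arguments map a digit label to that digit's embedding vector.
    Only digits present on both sides are scored.
    """
    shared = sorted(set(enroll) & set(test))
    if not shared:
        raise ValueError("no digits in common between enrollment and test")
    scores = [cosine_similarity(enroll[d], test[d]) for d in shared]
    return sum(scores) / len(scores)
```

For example, `max_feature_map([1.0, -2.0, 0.5, 3.0])` compares 1.0 with 0.5 and -2.0 with 3.0, and therefore halves the number of channels. A verification decision would then compare the output of `score_passphrase` against a threshold tuned on development data.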
Similar resources
Convolutional Neural Network with Adaptive Windows for Speech Recognition
Although speech recognition systems are widely used and their accuracy is continuously increasing, a considerable gap remains between their accuracy and human recognition ability. This is partially due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
Traffic Sign Recognition Using Extreme Learning Classifier with Deep Convolutional Features
Traffic sign recognition is an important but challenging task, especially for automated driving and driver assistance. Its accuracy depends on two aspects: feature extractor and classifier. Current popular algorithms mainly use convolutional neural networks (CNN) to execute both feature extraction and classification. Such methods could achieve impressive results but usually on the basis of an e...
Improved Gender Independent Speaker Recognition Using Convolutional Neural Network Based Bottleneck Features
This paper proposes a novel framework to improve performance of gender independent i-Vector PLDA based speaker recognition using convolutional neural network (CNN). Convolutional layers of a CNN offer robustness to variations in input features including those due to gender. A CNN is trained for ASR with a linear bottleneck layer. Bottleneck features extracted using the CNN are then used to trai...
EMG-based wrist gesture recognition using a convolutional neural network
Background: Deep learning has revolutionized artificial intelligence and has transformed many fields. It allows processing high-dimensional data (such as signals or images) without the need for feature engineering. The aim of this research is to develop a deep learning-based system to decode motor intent from electromyogram (EMG) signals. Methods: A myoelectric system based on convolutional ne...
Spoof Detection for Finger-Vein Recognition System Using NIR Camera
Finger-vein recognition, a new and advanced biometrics recognition method, is attracting the attention of researchers because of its advantages such as high recognition performance and lesser likelihood of theft and inaccuracies occurring on account of skin condition defects. However, as reported by previous researchers, it is possible to attack a finger-vein recognition system by using present...